
Add assembly test for -Zreg-struct-return option #145382


Open
wants to merge 7 commits into
base: master


winstonallo

@winstonallo winstonallo commented Aug 14, 2025

r? @tgross35

As discussed in #145309 with @tgross35 and @ojeda, I added assembly tests for the -Zreg-struct-return option verifying that it changes the ABI from hidden pointer to register-return on x86_32.

The test covers:

  • Direct struct construction, showing register return vs hidden pointer
  • External function calls returning structs, showing ABI mismatch handling

Different memory layouts affect ABI mismatch handling, but register returns use the same register allocation regardless of struct field layout (apart from the fact that they use smaller registers for smaller structs, of course).

Here is a compiler explorer with 2 examples. Let me know if there is anything more I could add. Since register returns only happen for structs up to the size of 2 registers, I figured testing the pivot value (8 bytes) would be most critical.
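The two cases listed in the description can be sketched as a pair of functions. This is a minimal illustration, not the test file from this PR: the names `Pair`, `foreign_pair`, `direct_construction`, and `external_call` are hypothetical, and the foreign function is given a local body only so that the sketch builds on its own.

```rust
/// Hypothetical 8-byte struct at the pivot size named in the description.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Pair {
    pub a: i32,
    pub b: i32,
}

/// Stand-in for a foreign function. A real assembly test would only declare
/// it in an `extern "C"` block; it has a body here so the sketch links.
#[inline(never)]
pub extern "C" fn foreign_pair() -> Pair {
    Pair { a: 3, b: 4 }
}

/// Case 1: direct struct construction. This isolates how the function
/// hands its own return value back (registers vs. hidden pointer).
/// A real test would also mark it `#[unsafe(no_mangle)]` so the symbol
/// name is stable in the emitted assembly.
pub extern "C" fn direct_construction() -> Pair {
    Pair { a: 1, b: 2 }
}

/// Case 2: forwarding the result of an external call. This shows how the
/// caller and the callee agree (or disagree) on the return convention.
pub extern "C" fn external_call() -> Pair {
    foreign_pair()
}
```

In an actual assembly test, each function would carry its own FileCheck directives for the two revisions (with and without the flag).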

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Aug 14, 2025
@ojeda
Contributor

ojeda commented Aug 14, 2025

Thanks for this!

> • Direct struct construction, showing register return vs hidden pointer
>
> • External function calls returning structs, showing ABI mismatch handling

I would suggest preserving these two lines in the tests themselves.

> Since register returns only happen for structs up to the size of 2 registers, I figured testing the pivot value (8 bytes) would be most critical.

Yeah. It would also be a good idea to test the other side of that threshold, i.e. testing a size where it does not happen even with the flag enabled.

(Also, a smaller size that puts both fields in the same register like in your CE example wouldn't hurt either, but I am not sure what the threshold for "too many tests" is in Rust)
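The three sizes being discussed could be laid out as follows. This is only a sketch: the struct names, field types, and `PIVOT_BYTES` are hypothetical, and the 8-byte figure is taken from the PR description rather than from the compiler's ABI code. The compile-time assertions only pin down the sizes; they say nothing about which convention the compiler picks.

```rust
use std::mem::size_of;

/// Threshold as stated in the PR description: two 32-bit registers.
pub const PIVOT_BYTES: usize = 8;

/// Below the pivot: both fields would fit in a single 32-bit register.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Below {
    pub a: i16,
    pub b: i16,
}

/// Exactly at the pivot.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct AtPivot {
    pub a: i32,
    pub b: i32,
}

/// Above the pivot: expected to keep the hidden-pointer return even
/// with the flag enabled, per the discussion above.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Above {
    pub a: i32,
    pub b: i32,
    pub c: i32,
}

// Checked at compile time, so a layout change fails the build.
const _: () = assert!(size_of::<Below>() < PIVOT_BYTES);
const _: () = assert!(size_of::<AtPivot>() == PIVOT_BYTES);
const _: () = assert!(size_of::<Above>() > PIVOT_BYTES);

pub extern "C" fn below() -> Below {
    Below { a: 1, b: 2 }
}

pub extern "C" fn at_pivot() -> AtPivot {
    AtPivot { a: 1, b: 2 }
}

pub extern "C" fn above() -> Above {
    Above { a: 1, b: 2, c: 3 }
}
```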

Comment on lines 56 to 58
// WITHOUT: addl $12, %esp
// WITHOUT: movl %esi, %eax
// WITHOUT: addl $8, %esp
Contributor

By the way, I am not sure how strict we want to be matching the entire stream of instructions for this case -- it could be improved or shuffled a bit by the backend.

On the other hand, this may be easy to relax if a newer LLVM changes it.

Author

Yeah, I was also wondering. It's definitely easy to relax later, but it would be nice to have a maintainer's opinion on this to make sure the test is not flaky. Unfortunately, I am not very familiar with how often LLVM changes this kind of thing.

Author

@winstonallo winstonallo Aug 14, 2025

I assume it would be fine to limit this to what differentiates it from the reg-struct-return assembly, i.e. the calll and the retl $4. What do you think?

Contributor

Yeah, like in the new ones you just added, right?

Contributor

Looks like the code here was:

pub unsafe extern "C" fn builtin_call(numerator: i32, denominator: i32) -> div_t {
    // WITH-LABEL: builtin_call
    // WITH: jmp div
    // WITHOUT-LABEL: builtin_call
    // WITHOUT: pushl %esi
    // WITHOUT: subl $8, %esp
    // WITHOUT: movl 16(%esp), %esi
    // WITHOUT: subl $4, %esp
    // WITHOUT: pushl 28(%esp)
    // WITHOUT: pushl 28(%esp)
    // WITHOUT: pushl %esi
    // WITHOUT: calll div
    // WITHOUT: addl $12, %esp
    // WITHOUT: movl %esi, %eax
    // WITHOUT: addl $8, %esp
    // WITHOUT: popl %esi
    // WITHOUT: retl $4
    div(numerator, denominator)
}

Which doesn't exist anymore. For reference though, here's a way-too-detailed answer:

The above is testing a lot: both the caller side and the callee side, as well as how i32 gets passed. For ABI tests it's best to make each test function isolate one of those aspects, which gives you less to assert on. For checking returns, that usually means returning some immediate value to test the callee side (like you have now), and then storing the result of a call (#145382 (comment)) to test the caller side. Then you basically only need to check that something gets written to the place you expect, and you can be more lax about how it gets there.

LLVM's repo has a lot more assertions about exact instruction sequences (using -NEXT), but these are usually autogenerated and run without optimizations so it's not as likely to be volatile. An example that's fresh in my head: https://github.com/llvm/llvm-project/blob/c22ec9cde3708e0c7afd0909508a67ef9625aa4c/llvm/test/CodeGen/X86/i128-fp128-abi.ll. We tend to avoid such tests in this repo because it's more painful to update when things do change, and LLVM covers ABI things pretty well.

For CHECK: directives where the ordering doesn't matter, FileCheck has CHECK-DAG. No need really to do this here, though it doesn't hurt if you feel like it (might make it possible to use the same test for the clif and gcc backends once/if those start getting asm tests).

It also has a cool feature where you can bind a regex to a name and then reuse it later, which is useful if you know it will use a scratch register but don't care which one: https://llvm.org/docs/CommandGuide/FileCheck.html#the-check-dag-directive.
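As a rough sketch of how those two FileCheck features might combine in a caller-side test: the function below stores the result of a call through a pointer, binds the register holding that pointer with `[[DST:...]]`, and uses `-DAG` for the two stores whose order does not matter. All names (`Pair`, `foreign_pair`, `store_result`) are hypothetical, the `WITH` prefix is borrowed from the snippets quoted above, and the instruction patterns are illustrative guesses at AT&T-syntax x86_32 output (assuming the two halves come back in %eax and %edx). They have not been checked against real compiler output.

```rust
#[repr(C)]
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct Pair {
    pub a: i32,
    pub b: i32,
}

/// Stand-in for a foreign function; a real test would only declare it
/// in an `extern "C"` block. Defined here so the sketch builds on its own.
#[inline(never)]
pub extern "C" fn foreign_pair() -> Pair {
    Pair { a: 3, b: 4 }
}

// Illustrative directives only (assumed output, see the note above).
//
// WITH-LABEL: store_result
// Bind whichever register ends up holding the destination pointer.
// WITH: movl {{[0-9]+}}(%esp), [[DST:%e[a-z]+]]
// WITH: calll foreign_pair
// The two halves may be stored in either order.
// WITH-DAG: movl %eax, ([[DST]])
// WITH-DAG: movl %edx, 4([[DST]])
pub extern "C" fn store_result(dst: &mut Pair) {
    // Forcing the result into memory makes the return registers observable.
    *dst = foreign_pair();
}
```

The explanatory comments deliberately avoid containing a directive pattern such as a prefix followed by a colon, since FileCheck would try to interpret those lines as checks.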

Organize tests into separate module for better legibility
@winstonallo
Author

> Yeah. It would also be a good idea to test the other side of that threshold, i.e. testing a size where it does not happen even with the flag enabled.
>
> (Also, a smaller size that puts both fields in the same register like in your CE example wouldn't hurt either, but I am not sure what the threshold for "too many tests" is in Rust)

Added tests for both of those cases. We can still decide to remove them if they turn out to be too much.
Add tests for < 8 and > 8 bytes structs

@ojeda
Contributor

ojeda commented Aug 14, 2025

The tests look great now with the comments and the extra cases.

@winstonallo
Author

I have now also adapted the 8-byte tests to be less rigid about the instruction stream. Thank you for all the feedback!

Comment on lines +48 to +57
#[unsafe(no_mangle)]
pub unsafe extern "C" fn small_call() -> small_t {
    // WITH-LABEL: small_call
    // WITH: jmp small

    // WITHOUT-LABEL: small_call
    // WITHOUT: calll small
    // WITHOUT: retl $4
    small()
}
Contributor

If the assembly here is exact then that would indicate that LLVM expects the caller and callee to have the same ABI, which is good. But FileCheck only checks that lines exist in that order, not that there isn't anything between them, so that would need some WITH-NOT/WITHOUT-NOT (or -NEXT) to assert it isn't moving more things around.
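To illustrate the ordering point, here is a hedged sketch of how a `-NOT` directive could tighten the `WITH` revision: it asserts that no `calll` appears between the label and the tail jump. The layout of `small_t` is hypothetical (the real definition is not shown in this thread), `small` is given a local body only so the sketch builds on its own, and the directives are illustrative rather than taken from the PR.

```rust
/// Hypothetical layout; the real `small_t` from the test file is not shown here.
#[allow(non_camel_case_types)]
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct small_t {
    pub a: i16,
    pub b: i16,
}

/// Stand-in for the external function; a real test would only declare it.
#[inline(never)]
pub extern "C" fn small() -> small_t {
    small_t { a: 1, b: 2 }
}

// WITH-LABEL: small_call
// Nothing matching the pattern below may occur before the tail jump.
// WITH-NOT: calll
// WITH: jmp small
//
// WITHOUT-LABEL: small_call
// WITHOUT: calll small
// WITHOUT: retl $4
pub extern "C" fn small_call() -> small_t {
    small()
}
```

A `-NEXT` directive would be stricter still, since it requires the match to be on the line immediately after the previous one, at the cost of being more sensitive to backend changes.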

However, for asm tests it's good to try to force the data to be moved in some way so you can observe it. If the function were changed to:

pub unsafe extern "C" fn small_call(dst: &mut small_t) {
    // Store the result through `dst` so the return registers become observable.
    *dst = small();
}

Then you can assert that it moves the result from specific registers to memory.

Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

4 participants